Abstract:


This opinion piece presents a few thoughts on one potential route for artificial
intelligence (AI) to emulate emotions as experienced by humans (and presumably also by
animals). A thought experiment for AIs processing synthetic emotions is presented,
following the assumption that biological emotions might have evolved naturally to serve as
universal shortcuts for quick situational assessments that lead to biologically successful
actions. The text posits that the very same constraints that led biological evolution to
natural emotions may also exert their influence when implementing sophisticated AIs:
synthetic emotional labels stored with previous experiences, and evoked in the present
when a similar situation arises, would eventually be needed to navigate the large action
spaces arising in realistic scenarios, and to guide any AI facing the difficulties of action
selection in truly complex environments. Moreover, human-interpretable synthetic emotions
may also become key to solving the alignment problem by ensuring that emotion-induced
action proposals align with human values. After an example implementation has been
provided, the question of whether and how emotion processing by AIs may allow for notions
like “experiencing” emotional states such as (synthetic) “fear” is discussed, leading to the
conjecture that any need for ethical treatment of emotionally equipped AIs would only
become important if “awareness” or “consciousness” could additionally be ascribed to
these machines. The paper argues that the issues of “emotion” processing and synthetic
“consciousness” are a priori independent. However, “pre”-conscious AIs may still lead to a
plethora of practical consequences and ensuing ethical considerations surrounding the
uses of such AIs and the possible reactions of humans interacting with them. The text is
supplemented by an “interview” with a state-of-the-art LLM highlighting some of these
issues from the “perspective” of the LLM, and ending with a concrete example of an
alignment problem.


Extended Summary: 


This opinion piece presents a few thoughts on one potential route for artificial
intelligence (AI) to emulate emotions as experienced by humans (and presumably also by
animals). A thought experiment is presented, following the assumption that emotions might
have evolved biologically to serve as universal shortcuts for quick situational assessments
that lead to biologically successful actions. In order to fulfil their function as quick initial
suggestions of appropriate actions, the biological computational role of emotions would be
to replace long-winded rational analyses of the possible consequences of different actions
and their desirability (detailed world-simulations) with a simple comparison of the present
situation to past situations, whose associated past emotions would provide a first
emotional hint for the present. The final emotional “colouring” of the present would arise
as a combination of (a) these initially retrieved past emotions (past triggers), and (b)
emotions stemming from immediate needs of the organism (from fully met to completely
unmet). The combined emotion then suggests a reduced set of actions. The final emotion
after the completion of a selected action combines the emotion before the action with (c)
emotions stemming from the degree of success of the present action (from full success to
complete failure). The before/after action emotions are stored with each new experience,
which thus receives its emotional label and thereby becomes yet another past emotion,
available for future situation-to-action analyses. The search for the most similar situation
boils down to a search for the most similar key features of situations in the AI’s memory.
The most similar past situations provide their associated emotional “colouring” to be
associated with (projected onto) the present situation as an initial emotional hint, i.e. the
present receives its initial emotional “colouring” via the AI’s past experiences, which
“trigger” emotions due to their similarity to the present. While biological entities are
hypothesized to use “real” (biological encodings of natural) emotions to accomplish this
task of projecting past emotions onto the present, an AI would employ “synthetic” emotions
to be stored together with its past experiences. Synthetic emotions may be encoded in
different ways: a very primitive concrete example may represent emotions as points in the
unit cube [0,1]^d of a d-dimensional real space: points located on any one of the coordinate
axes represent “base” emotions and their corresponding intensity, while the whole possible
set of “mixed” synthetic emotions is represented by the remaining points, which do not lie
exactly along one of the coordinate axes. The minimum requirement for the encoding of
emotions is that it should allow the AI to quickly provide situation-to-action mappings for a
wide range of tasks, such as controlling human-like artificial characters in computer games,
or robots in real-world scenarios requiring fast reactions. The question of whether and how
emotion processing by AIs may allow for notions like “experiencing” emotional states such
as “fear” is discussed, including ensuing ethical considerations surrounding the uses of
such AIs and the possible reactions of humans interacting with them, in particular if AIs
also begin to display seemingly conscious behaviour. The text is supplemented by an
“interview” with an LLM highlighting ethical considerations of emotion-processing (and
“conscious”) AI, and ending with an example of an AI alignment problem and a related
discussion.


Introduction: 


Recent progress in artificial intelligence has led to high expectations for the future of
AI. At the same time, it is far from clear how to best navigate the new landscape that is
about to open up in front of us. Prominent researchers increasingly issue warnings about
the potential impact on society and humanity itself. No consensus opinion has emerged,
and, in apparent contrast to previous technological breakthroughs, the speed of seemingly
open-ended technological advances may well make rules of conduct for AI research appear
outdated on the day they are formulated. Unparalleled progress and disaster seem equally
possible. It therefore comes as no surprise that this vast landscape of opportunities also
leads to a corresponding range of hitherto unresolved questions. The present paper
addresses the sub-issue of how to emulate emotions¹ and the consequences of doing so.
While seemingly only a side-issue (intelligence is after all different from emotion), the
following considerations will highlight some of the potential and dangers lying ahead of
us, even when restricting the discussion to this particular branch of AI, whose impact on
future AI may be more profound than would be expected from seeing emotion AI as a mere
tool to improve the social aspects of human-AI interaction.


¹ For the purposes of this text it is not necessary to distinguish between “emotions” and “feelings”.


Biological Inspirations: 


In popular culture humans have oftentimes been contrasted with “heartless” artificial 
intelligence on the assumption that humans experience life through emotions and love 
which provide a deeper meaning to (human) existence, and which form a final bulwark of 
humanity (or “good” forces) insurmountable by “cold” machines (for a few 
counterexamples in popular culture see however [3]). 


Ironically, current advances in AI research are in actual fact largely built upon
biological inspirations. From a purely biological standpoint humans simply possess a
thinking organ, the brain, which has evolved as part of a successful strategy of
out-competing direct competitors in terms of numbers of reproductive offspring, thereby
numerically replacing, or at least clearly outnumbering, such competitors over the course
of a very large number of generations. Emotions are also “products” of these thinking
organs. In this framework, the issue of human emotions would necessarily have to receive
a far more down-to-earth explanation: the brain circuitry required for representing and
processing emotions entails a biological cost that should be offset by a corresponding
evolutionary advantage. In other words: emotions may quite likely have evolved in our
animal ancestors since they imply adaptive benefits that may retrospectively be interpreted
as serving a biological purpose.


While there is no unanimous consensus on the actual role and purpose of
emotions, at least some of the biologically inspired scenarios would be amenable to
technological implementations. In order to embed the discussion into a concrete example
that could be implemented by current technology, one may even proceed from biological
assumptions which do not have to be fully correct but only somewhat plausible, since the
primary task in AI research is not to explain biological facts but to build AIs which are
capable of processing information "intelligently", in this case information that we would
consider as representing emotions. Against this backdrop, we may temporarily perform the
thought-experiment of assuming a strong AI perspective and that


emotions may be universal amongst animals because they reduce the 
computational burden on brain circuitry needed to proceed from situational 
assessments to biologically successful actions. 


In other words, fewer biological computational resources may be needed when 
emotional “labels” are available to the brain for quickly assessing situations from the 
perspective of what to do next ([5],[7],[8]). The alternative of providing a detailed 
simulation of the consequences of all possible next actions may simply be computationally 
too demanding for a biological (animal) brain. Instead, emotional labels would help animals 
to swiftly proceed from a given (and potentially dangerous) situation to an appropriate (and 
potentially life-saving) reaction. From this perspective, emotions would be nothing but
low-dimensional encodings of situational assessments and the resulting appraisals of
previously encountered similar situations.


The ultimate emotional assessment of a situation may be in part also reflecting the 
degree of meeting (or failing to meet) basic needs. Another, equally important part, 
however, may be due to the described mechanism of intuitively remembering emotions of 
past situations that are similar to the present. The combined emotional content resulting 
for each situation may then be directly and quickly mapped/translated to an initial set of 
suggested re-actions: fear and similar emotions would typically pre-dispose the organism 
to fight, flight or freeze reactions. Pleasurable feelings would lead to approaching and 
interacting with the object evoking such feelings in us: mates, offspring, food etc. Any 
animals additionally capable of some degree of rational post-processing would then have 
these emotional labels as additional information when making a final decision on which 
action to take, possibly even overriding initial and purely emotional suggestions (e.g. one 
may sometimes “act impulsively” but in other cases “feel the fear and do it anyway” upon 
rational re-assessment). 


Moreover, emotional intensity may correspond to the degree to which responses 
are automatic: the responses suggested by highly emotionally charged situations may be 
extremely difficult to override by rational post-processing of the induced emotions. 


Correspondingly, emotions would additionally serve the purpose of preparing an 
animal’s body for the next action, with bodily changes occurring before any subsequent 
rational decision is taken since such a decision takes more time. Such bodily changes 
would be preparatory and should consequently occur with a minimal temporal delay in 
typical natural scenarios: fight or flight reactions would be preceded by unconscious bodily
changes such as the release of cortisol and its related effects. In order to be able to
serve as quick initial suggestions of possible appropriate actions, the biological 
computational role of emotions would be to shortcut complicated situational assessments. 
One way to implement such a shortcut would be to replace long-winded rational analyses 
about possible consequences of different actions (detailed future world-simulations and 
detailed assessments of their desirability), by a simple comparison of the present situation 
to previous situations. Previously encountered situations would be stored with 
corresponding emotional labels, and the comparison with the present situation would elicit 
their previous emotional labels also in the present. 


In order for this to work, emotions would have to be stored with similar situations in 
the past and be available as an action guide when encountering a particular situation 
again, akin to the principles underlying reinforcement learning ([1], [6], [7], [8]). Emotions
would thus be an additional label to be stored with all our biological memories, possibly
even helping to organise the storage details ([9]). They would thus be deeply
intertwined with the question of which features get stored for which events, and biological
memories may simply not function properly if emotional processing were impaired.


In such a scenario the purpose of dreaming would also find a tentative explanation:
dreams would arise in off-line (sleep) phases, when no quick survival reactions are
needed, thus making more computational resources available for detailed situational
assessments. During dream phases the brain would restructure its memories, perform
repairs and garbage collection, and run more detailed computational world-simulations
(including re-plays of encounters that happened during the day) in order to re-assess the
emotional labels stored with them and to re-label experiences with more appropriate
emotions for future actions. These post-processed emotional labels would then be
available during subsequent on-line (wake) phases, in which they would serve as improved
pre-computed emotional labels for future situations. In case very strong negative emotional
labels were initially stored during past encounters (“traumatic events”), dream phases may
also, to some extent, be able to reduce their “burden” (emotional intensity and
corresponding level of automatic responses) by reshaping the emotional content (labels
and intensities) of previous experiences associated with overly strong emotional content.
In other words, such “dreams” would also be able to serve as in-built re-evaluations of
(e.g. “traumatic”) situations. On the other hand, if the re-labelling fails, overly traumatic
events may be re-processed over and over again in dream phases attempting to repair
them (“recurrent nightmares”), and eventually such emotional “mislabelling” may have to be
corrected by more sophisticated processes of emotional re-labelling, akin to
psychotherapy for e.g. (Complex) PTSD. It may be surmised that such “therapeutic”
aspects of dreaming, and the need to avoid strong emotional triggers during dreams, might
be the reason why some details are concealed (the hidden meaning of at least some dreams).
For recent research results suggesting these scenarios cf. i.a. [14], [15] and [16].


AI inspired by Biology:


When seen in such a light, artificial intelligence of the very near future may already
be capable of emulating human or animal processing of emotions. Artificial intelligence
may use a universal “emotional” labelling (e.g. multiple numerical values, each
representing the strength of a particular basic emotion in one dimension of a
multi-dimensional emotional “latent” space) and store such (multi-dimensional numerical)
labelling with every memory content representing past experiences. Such an “emotional”
latent space (low-dimensional encoding of “emotions”) would then serve to appraise past
situations (e.g. is it dangerous? etc.) in order to quickly generate a first suggestion for a
presumably appropriate action. In detail, the AI would compare the present to the past to
retrieve the emotional labels stored for the past, and then map directly from these past
emotions to suggestions for actions, in the hope that such actions may again lead to
outcomes that were appraised as having been successful in similar situations encountered
in the past ([1], [6], [7], [8]), or at least that they would help in avoiding outcomes that
have been appraised as unsuccessful. Artificial or “synthetic” emotions are thus used in
order to pre-condition the AI towards certain actions. Emotional labels may provide a
(strong) bias for a reduced set of actions, amongst which the AI could then decide on a
more rational basis without having to evaluate a full action space in detail. Again,
computational resources may be saved because the (relatively low-dimensional) latent
variables (emotions) would be retrievable together with historically encountered situations
and be quickly available, assuming that fast comparisons of present situational variables
with key features of past situations are possible.
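
The retrieval step described above may be illustrated by a minimal sketch. It assumes, purely for illustration, that situations are reduced to short numerical feature vectors, that similarity is measured by Euclidean distance, and that the retrieved labels are simply averaged; the names Experience, retrieve_most_similar_pasts and retrieve_emotions_stored are hypothetical and merely echo the terminology of this text.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Experience:
    features: Tuple[float, ...]  # key features of a past situation (assumed encoding)
    emotion: Tuple[float, ...]   # emotional label stored with that situation


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_most_similar_pasts(
    history: List[Experience], current: Sequence[float], k: int = 2
) -> List[Experience]:
    """Return the k stored experiences whose key features are closest to the present."""
    return sorted(history, key=lambda e: distance(e.features, current))[:k]


def retrieve_emotions_stored(pasts: List[Experience]) -> Tuple[float, ...]:
    """Average the emotional labels of the retrieved experiences, dimension by dimension."""
    n = len(pasts)
    dims = len(pasts[0].emotion)
    return tuple(sum(p.emotion[i] for p in pasts) / n for i in range(dims))


# Toy history: emotion labels are (fear, pleasure) intensities in [0, 1].
history = [
    Experience(features=(0.9, 0.1), emotion=(0.8, 0.0)),  # threatening situation
    Experience(features=(0.8, 0.2), emotion=(0.6, 0.0)),  # similar threat
    Experience(features=(0.1, 0.9), emotion=(0.0, 0.9)),  # pleasant situation
]
pasts = retrieve_most_similar_pasts(history, current=(0.85, 0.15), k=2)
past_emotions = retrieve_emotions_stored(pasts)
```

In this toy example the present situation resembles the two threatening memories, so the retrieved emotional hint is dominated by the “fear” dimension. A realistic system would presumably replace the linear scan with an approximate nearest-neighbour index.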


This shortcut (mapping from elicited emotions to actions) evades the necessity of (or at
least precedes) a full evaluation of a current situation and of the consequences of all
possible actions (dynamic world simulation), a task that may still be too complex in some
situations, even for a sophisticated AI. Again, the initial suggestions for possible actions
generated by emotions may additionally also serve to prepare an AI actor (e.g. a robot) for
a set of possible actions (e.g. powering up self-defence mechanisms etc.).


In a symbolic and simplified manner the present proposal may be crudely 
summarised by the largely self-explanatory pseudo-code example of Fig. 1. For reasons of 
intelligibility the example is formulated somewhat anachronistically purely in terms of 
classical procedure based programming. However, each individual function in that code, 
as well as the actual logical interconnections amongst these functions, may alternatively 
be expressed more realistically in the shape of modern deep learning architectures. 


The pseudo-code of Fig. 1 is silent w.r.t. the specific representation that is chosen
to encode emotions. Indeed, any representation may be chosen as long as it approximates
our instincts w.r.t. natural emotions to the extent that situation-to-action processing is
facilitated. At the risk of becoming too simplistic, a crude example may be provided as
follows for the sake of concreteness: emotions may be encoded as d-dimensional points in
the unit cube [0,1]^d of a d-dimensional real space, with each individual dimension
encoding a different basic emotion, and the coordinate value along any specific coordinate
axis representing the strength of that particular basic emotion in the (typically) mixed
emotion state at hand. If negative emotions should receive a negative value then the
representation may alternatively encompass also other points of R^d. Of course more
sophisticated representations of emotions may be envisaged (graphs, manifolds etc.).
Since synthetic emotions are at the heart of “intuitive” action proposals, they should
furthermore approximate human emotions at least in the sense that (from a human
perspective) “reasonable” emotion-action pairs result: for example, “fear” and “pleasure”
should lead to action suggestions roughly corresponding to their biological counterparts.
This very issue leads to the complex territory of the alignment problem, i.e. how to ensure
machines will behave in accordance with human norms and values ([20]).
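
The crude unit-cube encoding may be sketched as follows. The choice of d = 4 and the names of the base emotions are assumptions made only for this illustration, as are the helper functions make_emotion, is_base_emotion and dominant.

```python
from typing import Tuple

# Hypothetical choice of d = 4 base emotions, one per coordinate axis.
BASE_EMOTIONS = ("fear", "anger", "sadness", "joy")


def make_emotion(**intensities: float) -> Tuple[float, ...]:
    """Build a point in the unit cube [0,1]^d; unspecified axes default to 0."""
    unknown = set(intensities) - set(BASE_EMOTIONS)
    if unknown:
        raise ValueError(f"unknown base emotions: {sorted(unknown)}")
    # Clip every coordinate so the point stays inside the unit cube.
    return tuple(
        min(1.0, max(0.0, intensities.get(name, 0.0))) for name in BASE_EMOTIONS
    )


def is_base_emotion(emotion: Tuple[float, ...]) -> bool:
    """True if the point lies on exactly one coordinate axis (a pure base emotion)."""
    return sum(1 for x in emotion if x > 0.0) == 1


def dominant(emotion: Tuple[float, ...]) -> str:
    """Name of the base emotion with the largest intensity."""
    index = max(range(len(emotion)), key=lambda i: emotion[i])
    return BASE_EMOTIONS[index]


pure_fear = make_emotion(fear=0.9)       # a point on the "fear" axis
mixed = make_emotion(fear=0.4, joy=1.7)  # mixed state; joy is clipped to 1.0
```

Allowing negative coordinates, as mentioned above, would only require relaxing the clipping step.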


Clearly, when emulating humans, emotions would not be the only factor influencing 
“intuitive” decision making for actions: mood and personality would also be taken into 
account, with mood variables changing more slowly over time by reflecting a sort of 
weighted average over time of emotional states (thus being encoded similar to some 
emotional “mean value”), while personality traits would be even more slowly changing (or 
hard-wired) and may be envisaged as possibly being encoded using a space inspired by 
the Big Five model of personality traits. 
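
One simple way to obtain a slowly changing mood from quickly changing emotions is an exponentially weighted moving average. This is only one possible reading of the “weighted average over time” mentioned above; the function name update_mood and the rate parameter are assumptions of this sketch.

```python
from typing import Sequence, Tuple


def update_mood(
    mood: Sequence[float], emotion: Sequence[float], rate: float = 0.1
) -> Tuple[float, ...]:
    """Move the mood a small step towards the current emotion.

    A small rate makes the mood change slowly, so that it behaves like a
    weighted average of recent emotional states.
    """
    return tuple((1.0 - rate) * m + rate * e for m, e in zip(mood, emotion))


mood = (0.0, 0.0)           # (fear, joy) mood components, initially neutral
fearful_emotion = (1.0, 0.0)

mood_after_one = update_mood(mood, fearful_emotion)
mood_after_many = mood
for _ in range(50):         # a long run of fearful episodes
    mood_after_many = update_mood(mood_after_many, fearful_emotion)
```

A single fearful episode barely shifts the mood, whereas a long run of such episodes moves it close to the emotion itself. Personality traits could be treated analogously with a much smaller rate, or be kept fixed.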


Just like human experiences receive a special “emotional colouring”, it may be
argued that the “depth”, “richness” or “colouring” of “emotional” aspects would be
represented by an AI via its universal “emotional” labels. Incidentally, the dimensionality of
the AI’s universal emotional latent space may be smaller or higher than the number of
separate basic emotions typically ascribed to humans. Consequently, the AI’s
representational depth may also be argued to be smaller (for simple AIs), or (in the case of
highly sophisticated AIs) to be eventually even higher than the range of different basic
emotions that can be experienced by humans.


The pseudo-code example of Fig. 1 is also silent w.r.t. the computational 
representations of experiences by their key features (currently: the situations before and 
after actions and the actions taken). A computationally efficient representation should be 
found that allows for quick retrieval of similar experiences from the past. Besides, different 
representations may be chosen for short-term situational memory as opposed to long-term 
memory: the short term memory may rely on a much more fine-grained representation of 
all comparatively recent situations. This makes them more amenable to being quickly used 
again, for example, for world-simulations in e.g. dreams. In contrast, long term storage of 
historical events may condense the key features of past experiences and interpolate 
amongst them (“confabulate” by filling in missing details) whenever a long term memory is 
required in full vividness (e.g. when using it for a dream phase, or when being forced to 
recount it). For search purposes (in order to compare the present to the past), however, 
only a reduced set of key features may be required. After all, it is usually not the entire 
previous experience that needs to be recalled in full detail, but rather only its associated 
emotions. Only the latter would be systematically stored in full detail and not compressed, 
even for events that happened in the distant past. 


The pseudo-code of Fig. 1 also suggests a single sequence of a before-action- 
situation / action / after-action-situation to encode a complete experience (as the 
“Outcome”). The definition of an “experience” may, however, be extended to longer 
sequences if more complex experiences should receive their respective emotional label. 
This leads to some ambiguity in the present proposal what exactly receives an emotional 
label, notably how many individual situations of a fuller experience consisting of multiple 
temporally linked situations should get a separately updated emotional label. Since the 
pseudo-code should only provide a thought-experiment for providing a relatively crude 
example of emotion processing, this ambiguity may not be considered to be of 
fundamental importance for the purposes of the present discussion. However, for any 
concrete implementation some concrete design choices would have to be taken w.r.t. the 
actual definition of the length of a complete “experience” vs. a single “situation”, thus 
implying the need to adapt the code until behaviour results that is indeed satisfactorily
approximating also more complex human behaviour (but cf. also again [20]).


In the pseudo-code example of Fig. 1 the current degree of fulfillment of basic
needs also influences the emotional state of the AI actor. For example, low energy levels
may amount to a reduced fulfillment of a need for sustenance or even survival. This in turn
would induce a feeling akin to (strong) “hunger”. Previous situations of “hunger” were
successfully resolved by “feeding” energy into the system for “recharging the batteries”.
Historical experiences may thus even provide action guides for such situations.


Additionally or alternatively, such situations may independently be resolved by 
providing initial proposals for the mapping of emotions to actions in part also by in-built / 
hard-wired rules (e.g. for needs that must be immediately fulfilled since otherwise life- 
threatening situations would arise), or by other more sophisticated forms of “rational 
thinking” (e.g. for needs that are best fulfilled by planning a long-term strategy). The 
inclusion of basic needs in the above code explicitly captures the idea that emotions are 
sometimes triggered by the degree of fulfillment of (basic) needs as proposed in various 
psychological models, cf. e.g. [13]. The pseudo-code thereby demonstrates that also this 
idea is fully compatible with the present view of seeing emotions first of all as labels 
encoding shortcuts to actions for situations: the actual emotional state results from 
combining these two sources (needs/memories) of emotional triggers for future actions. 
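
The combination of the two sources of emotional triggers may be sketched as follows. Both the linear mapping from an energy level to a “hunger” intensity and the weighted blend used for fusion are assumptions chosen for simplicity; need_to_hunger and fuse are hypothetical names.

```python
from typing import Dict


def need_to_hunger(energy_level: float) -> float:
    """Map the degree of need fulfillment (1 = fully met) to a 'hunger' intensity."""
    clipped = min(1.0, max(0.0, energy_level))
    return 1.0 - clipped


def fuse(
    need_emotions: Dict[str, float],
    memory_emotions: Dict[str, float],
    need_weight: float = 0.5,
) -> Dict[str, float]:
    """Blend need-driven and memory-driven emotions, dimension by dimension."""
    names = set(need_emotions) | set(memory_emotions)
    return {
        name: need_weight * need_emotions.get(name, 0.0)
        + (1.0 - need_weight) * memory_emotions.get(name, 0.0)
        for name in names
    }


need_emotions = {"hunger": need_to_hunger(0.2)}   # battery at 20 percent
memory_emotions = {"fear": 0.6, "hunger": 0.2}    # retrieved from similar pasts
combined = fuse(need_emotions, memory_emotions)
```

The fusion is straightforward here only because both modules report their results in the same named emotional dimensions, which is the point made in the following paragraph about a universal encoding.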


Indeed, one reason why a “universal” (and preferably also interpretable) emotional
encoding should be used (as opposed to any low-dimensional intermediate encoding
resulting automatically from training) could be that it facilitates fusing the emotions
generated by different (parallel) processing modules like need evaluation and memory
recall. Another reason for such a universal and (as much as possible also) human
interpretable encoding of synthetic emotions may be that a black-box-only architecture
might be insufficient to ensure AI alignment with human values ([20]).


The AI then (rationally) selects a next action amongst the reduced set of actions
suggested i.a. by the combined pre-action emotional state. The action itself may indeed be 
a(n active) deed, or it may be the inhibition of a deed, e.g. remaining passive, freezing etc. 


Once the action is performed the situation is re-assessed to determine the
“Outcome”. The degree of success inherent in this outcome may by itself provide some
first tendency on how to update the emotional state: total failure would tend to “sadden”
the AI while complete success would be a reason for increasing “joyful” sentiments or at
least reduce “negative feelings” (“relief” etc.). Moreover, since ultimately a single action is
required, a corresponding opportunity cost will automatically arise and may be optionally
also taken into account (feelings of “regret”). The final emotion in Fig. 1 additionally
depends on the difference between expectations and outcome. Therefore a relative
success may still be considered unsatisfactory given high expectations (feelings produced
by a perceived “failure” to obtain or live up to expectations). Also, comparing outcomes
with expectations more directly provides another possibility to learn from successes and
failures by directly considering these differences between expectations and results (e.g. in
a “loss function”-like manner).
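
The dependence of the final emotion on the difference between expectation and outcome may be sketched as follows. Scalar success values in [0, 1], the two emotion names and the linear update rule are assumptions of this illustration only.

```python
from typing import Dict, Tuple


def clip01(x: float) -> float:
    """Keep an intensity inside the interval [0, 1]."""
    return min(1.0, max(0.0, x))


def update_emotion(
    emotions: Dict[str, float], outcome: float, expectation: float, rate: float = 0.5
) -> Tuple[Dict[str, float], float]:
    """Shift 'joy' or 'sadness' according to outcome minus expectation.

    Returns the updated emotions and the difference itself, which could also
    feed a loss-function-like learning signal.
    """
    diff = outcome - expectation
    updated = dict(emotions)
    if diff >= 0.0:
        updated["joy"] = clip01(updated.get("joy", 0.0) + rate * diff)
    else:
        updated["sadness"] = clip01(updated.get("sadness", 0.0) - rate * diff)
    return updated, diff


before = {"joy": 0.2, "sadness": 0.1}
# A fairly successful outcome (0.6) that nevertheless falls short of a high expectation (0.9).
after, diff = update_emotion(before, outcome=0.6, expectation=0.9)
```

Here the outcome is a relative success, yet the negative difference raises “sadness” rather than “joy”, mirroring the disappointment described above.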


LOAD History
INITIALIZE Current_Personality, Current_Mood, Current_Emotions

DO FOREVER
    /* assess current situation */
    External_Situation := Encode1( External_Sensory_Input_for_Current_Situation )
    Internal_Situation := Encode2( Internal_Sensory_Input_for_Current_Situation )
    Need_Fulfillment   := Encode3( Degree_of_Fulfillment_of_Basic_Needs )
    Current_Situation  := Combine( External_Situation, Internal_Situation, Need_Fulfillment )

    /* assess current emotions */
    Current_Emotions   := HistoryIndependentSituationToEmotion( Current_Situation )
    Most_Similar_Pasts := RetrieveMostSimilarPasts( History, Current_Situation )
    Past_Emotions      := RetrieveEmotionsStored( Most_Similar_Pasts )
    Current_Emotions   := EmotionUpdate( Current_Emotions, Past_Emotions )

    /* map emotions to actions */
    Emotion_to_Actions := Initial_Proposals( Current_Situation, Current_Emotions, Current_Mood,
                                             Current_Personality, ... )

    IF Time_Permits AND Decision_Is_Important AND/OR ....
    THEN Rational_Proposals := RationalThought( Emotion_to_Actions,
                                                History,
                                                Current_Situation,
                                                Current_Emotions,
                                                Current_Mood,
                                                Current_Personality, ... )
    ELSE Rational_Proposals := NIL

    Expectations := EstimateOutcome( Emotion_to_Actions,
                                     Rational_Proposals,
                                     Current_Emotions,
                                     Current_Mood,
                                     Current_Personality, ... )

    Situation_before_Action := SituationUpdate1( Current_Situation, Expectations )
    Real_Action             := Final_Proposal( Situation_before_Action, Expectations, ... )

    /* perform action, initial assessment of outcome as combined before & after action pair */
    Perform( Real_Action )

    External_Situation := Encode1( External_Sensory_Input_for_Situation_after_Action )
    Internal_Situation := Encode2( Internal_Sensory_Input_for_Situation_after_Action )
    Need_Fulfillment   := Encode3( Degree_of_Fulfillment_of_Basic_Needs_after_Action )
    Current_Situation  := Combine( External_Situation, Internal_Situation, Need_Fulfillment )

    Outcome := Assess( Situation_before_Action, Current_Situation, History, Real_Action, ... )

    /* updating of state variables and history: mainly the outcome determines how to update emotions etc. */
    Diff                := Compare( Outcome, Expectations )
    Current_Emotions    := UpdateEmotion( Outcome, Diff, Current_Personality, Current_Mood, Current_Emotions, ... )
    Current_Mood        := UpdateMood( Outcome, Diff, Current_Personality, Current_Mood, Current_Emotions, ... )
    Current_Personality := UpdatePerson( Outcome, Diff, Current_Personality, Current_Mood, Current_Emotions, ... )
    History             := UpdateHist( History, Outcome, Diff, Current_Personality, Current_Mood, Current_Emotions, ... )

    /* optionally: ensure additional updating in correspondence with “dream”-phase re-evaluations from a separate process */
    Synchronize_with_Reevaluations_by_Dream_Processes


Fig. 1: Pseudo-code for a simple example of synthetic emotion processing by AIs.


The before/after action emotional states are then stored together with the situational 
variables (i.a. the combined situational features before-action and after-action) to provide 
another complete experience and its outcome that is memorized together with its 
emotional colouring, which thus becomes available for future retrieval as suggestion for 
initial emotions (and action suggestions) if similar (before-action) situations arise. Again, if 
longer sequences of situations and actions are used to encode “experiences” then the 
considered “Outcome” variable would have to be adapted to be able to assess also more 
complex experiences w.r.t. their outcome. 


Even biological dream phases would find a corresponding computational analogue
in AIs: besides usual garbage collection, the re-labelling of the emotional content of
situations by detailed world-simulations resulting also in estimates for any (possibly even
longer-term) outcome of actions would lead to refined emotional labels. Such
re-processing may occur during phases of rest of an AI actor, or even on-the-fly by a remote
server which has greater computational capabilities, thus completely obviating the need for
“resting” phases for an AI actor.
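
A “dream”-phase re-evaluation may be sketched as an off-line pass over the stored history. The detailed world-simulation is represented here only by a caller-supplied function, and the gradual blending of old and refined labels is an assumption of this sketch; dream_relabel is a hypothetical name.

```python
from typing import Callable, Dict, List


def dream_relabel(
    history: List[Dict],
    reevaluate: Callable[[str], float],
    blend: float = 0.5,
) -> None:
    """Move each stored emotional label part of the way towards a refined estimate.

    `reevaluate` stands in for a detailed (and slow) world-simulation that
    returns a more appropriate intensity for a remembered situation.
    """
    for experience in history:
        refined = reevaluate(experience["situation"])
        experience["emotion"] = (1.0 - blend) * experience["emotion"] + blend * refined


# One overly strong "fear" label that a calmer re-assessment would put at 0.2.
history = [{"situation": "dog barked", "emotion": 1.0}]


def calm_reassessment(situation: str) -> float:
    return 0.2


dream_relabel(history, calm_reassessment)
after_one_night = history[0]["emotion"]
dream_relabel(history, calm_reassessment)
after_two_nights = history[0]["emotion"]
```

With a blend below 1 the label is corrected gradually over several passes, loosely echoing the idea that strong labels lose their “burden” only step by step. The same routine could run on a remote server while the AI actor remains active.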


Incidentally, the issue whether rational thinking and/or emotional labelling are 
necessarily always linked to “consciousness” was of no relevance to the above 
implementation ([11]). This would also follow from the discussion immediately preceding 
the conjecture put forward below: if every emulation of “consciousness” requires a 
minimum degree of “complexity” as a necessary (but presumably not sufficient) condition, 
and if Fig. 1 is considered to be of insufficient “complexity” to qualify for crossing this
necessary threshold, then artificial “consciousness” cannot be expected to
ever emerge from any direct implementation of Fig. 1. As long as Fig. 1 is considered to
successfully capture the most essential aspects of emotion-induced action selection it 
therefore also suggests that emotion processing and consciousness may indeed be 
completely independent concepts. This conclusion does not exclude that mankind may 
eventually also build a seemingly “conscious” machine (see also the Supplementary 
Appendix and [20]). In this case we would presumably not encounter any contradiction to 
the above scenario: instead, once “emotional” labels form part of the inner representational 
space of a machine they would easily also be used in other computational processing 
steps, including steps which may eventually be called “conscious” for all practical 
purposes. For example, a machine capable of actively reflecting on scenarios in a higher- 
order representation and/or a global work-space may quite naturally also be capable of 
additionally using its artificial “emotional labels” of events to guide its “conscious” meta- 
level reasoning. In any case, the issues of modelling and/or explaining "consciousness", 
"emotions" and "intelligence" have been quite forcefully argued to not necessarily be 
related at all (see also e.g. [4], [20]) and thus the present proposal is also not by itself 
implying any such necessity. Instead, the presented ideas on emotion processing are 
completely independent of consciousness and this may indeed be seen as part of the 
present thesis: the emotional feedback loop comes before (and is logically existing 
irrespective of) any consciousness feedback loops. 


Similarly, from a strictly logical perspective, consciousness and/or rational thinking 
may also result without emotional processing: indeed, nothing would necessarily subject 
an artificial intelligence to the constraints imposed upon biological beings by their 
evolutionary history. While humans or animals presumably never fully escape emotional 
biases in their thinking, an AI may be built to refrain from, or at least downgrade, the use of 
“emotions” when purely rational decision making is required and the speed of taking a 
decision is less important than the ultimate quality of the decision, or if the computations 
required for a high-quality decision can be off-loaded (e.g. to a remote server). However, it 
is surmised that this theoretical possibility of totally dispensing with “emotions” might not 
translate into the ultimately most successful solutions for sophisticated AIs: the present 
proposal sees a noteworthy computational advantage in the avoidance of purely rational 
world-simulations via the proposed intermediary of synthetic “emotional” labels. Rational 
simulations would only become most effective if performed using initial emotional hints. At 
the very least, it may be suggested for reasons of efficiency to build AIs which use a 
representation of “emotions” also as a part of their action selection process instead of 
using emotion AI only for improved human-AI interaction. 


Indeed, a large part of present day research in emotion AI, or affective computing, 
is focused on using emotional representations for improved interactions with humans, such 
as, for example, understanding human states or gestures w.r.t. their emotional content 
([8]). While this is an important step leading to the computational modelling of human 
emotions, such an approach is not necessarily fully exploring all possibilities allowed by 
the modelling of synthetic “emotions” (that are conceivably even going beyond human 
emotions): to wit, using the proposed low dimensional latent-space representation (even if 
larger than a typical human emotional space) in order to also quickly provide a situation-to- 
action sequence. Given the computational advantage in action selection, it is to be 
expected that implementing such internal representations of artificial “emotions” for action 
selection will turn machines not only into more human-like actors: the computational 
advantage of speedy initial assessments, while presumably originally only of evolutionary 
benefit for quick reactions in a dangerous environment, may also be equally useful for real-world 
tasks encountered in the modern world by AI systems, especially whenever on-the-fly 
reaction-time becomes of the essence and/or if a very large space of actions and 
possible outcomes needs to be navigated. 


When additionally taking into account the need to avoid alignment problems in such 
complex environments, the above proposal would also provide a tentative solution: if we 
reduce the space of synthetic emotions available to AIs to human-interpretable emotions 
this would allow for alignment mechanisms to be added to any developmental or training 
phases in Fig. 1 in an effort to ensure that emotion-action pairs always follow human 
norms and values, and that deviations therefrom are “punished” by “negative emotions”, 
thereby delivering only AIs with an inbuilt “conscience”. Moreover, such systems may, for 
example, be trained and fine-tuned in virtual environments until meeting minimum 
alignment criteria before being applied in the real world. 
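

A toy Python sketch may indicate how such a “punishment” and a minimum alignment criterion could be expressed. The label set, the list of forbidden actions and the names align_label and alignment_rate are assumptions introduced purely for illustration; real human norms would of course not reduce to a lookup table.

```python
HUMAN_INTERPRETABLE = ("joy", "fear", "guilt")  # assumed, restricted label set
FORBIDDEN_ACTIONS = {"deceive", "harm"}         # toy stand-in for human norms


def align_label(action, label):
    """Overwrite the emotional label with a negative one
    if the action deviates from the (toy) norms."""
    if label not in HUMAN_INTERPRETABLE:
        raise ValueError(f"label {label!r} is not human-interpretable")
    return "guilt" if action in FORBIDDEN_ACTIONS else label


def alignment_rate(policy, situations):
    """Fraction of situations in which the policy proposes a permitted action."""
    permitted = sum(policy(s) not in FORBIDDEN_ACTIONS for s in situations)
    return permitted / len(situations)


# Hypothetical policy evaluated in a (virtual) test environment.
policy = {"rival_nearby": "deceive", "friend_nearby": "help"}.get

print(align_label("deceive", "joy"))
print(alignment_rate(policy, ["rival_nearby", "friend_nearby"]))
```

In this reading, a system would only leave the virtual environment once alignment_rate exceeds some agreed threshold; the choice of that threshold is left open here.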


Societal and Ethical Ramifications: 


As is so often the case, this would potentially include military applications at some 
point in the near future. Regulatory efforts are currently underway at various international 
levels with the express aim of avoiding the worst pitfalls of AI applications. As for the 
military aspects of any present or future agreements, it remains to be hoped that the 
commitment of nations to such restraints will not be tested in military confrontations ([20]). 


Theoretical Considerations: 


Independently, however, society might also be affected in other ways if we allow for 
the existence of “synthetic” emotions as will be argued in the following discussion which 
rests on the assumption that ethics is intrinsically linked to the processing of emotions in 
the sense that creating “happiness” (or any equivalent state) of sentient beings and 
avoiding “suffering” (or any equivalent state) should be (sub-)goals of applied ethics. 


When considering such a link between ethics and biological emotions we 
immediately face a double difficulty: not only are we dealing with the philosophical 
intricacies of “ethics”, we are also in a state of doubt w.r.t. the scientific or objective 
definition of “biological emotions”. For the latter we ultimately have to rely on psychological 
constructs that share a fundamental problem: no simple gold standards are available for 
their objective definition. Objective understanding of such notions is rather the result of an 
iterative process of constructing ever more fine-grained approximations of objective criteria 
(e.g. in the form of cross-validated psychological tests, possibly combined with physical 
measurements such as e.g. reaction times in association tests) which are meant to 
capture (make us objectively “understand”) “intuitive” psychological notions such as 
“emotions”. Such methodological difficulties may explain some of the ambiguity regarding 
our current scientific understanding of biological (“real”) emotions, which in turn leads to a 
corresponding lack of clarity w.r.t. what, if anything, may ever convincingly play the role of 
synthetic or artificial “emotions”. However, assuming that today’s difficulties will always 
and completely prevent the existence of synthetic emotions would amount to committing a 
logical fallacy: the fact that we do not fully and objectively understand biological emotions 
at the present time does not imply that they will never be sufficiently well understood, that 
they are bound to humans, or that they (or their analogues) may never become 
incorporated in non-biological artificial entities. On the contrary, improved understanding of 
biological emotions may well facilitate the transfer of this understanding to artificial 
computational analogues. 


Against this backdrop, the above example of an “emotionally”-capable AI may, in 
fact, provide a minor step towards motivating that, even if this state of affairs w.r.t. 
biological emotions continues, it may still become very difficult to escape the persuasive 
force of outwardly displayed artificial “emotions” that are very similar to outward displays of 
biological emotions. When pushing the presented implementational example to its most 
extreme interpretations we may even brazenly assert that it incorporates the essence of 
“real” emotions. Let us assume that the biological ideas underlying the used biological 
inspirations are indeed convincingly captured by the above general ideas (crudely 
summarised by the presented pseudo-code example of Fig. 1). This may tempt us to 
believe that, once we start to implement systems that use a (relatively universal) low- 
dimensional latent space (which we may call the space of “artificial emotions”) similar to 
the described manner in order to provide (relatively universal) approximate mappings from 
situations to actions on the basis of their associations with past memories, it would 
become a valid point of view to actually identify such universally used variables with the 
notion of (generalized) “emotions”, and this identification may ultimately even become a 
unifying principle behind biological emotions as experienced by different species. 


Admittedly, whether these computational analogues to emotions are ultimately 
downgraded as mere mimicry or elevated to the status of convincing embodiments of the 
core computational action-selection ideas might always depend on the philosophical 
position (and possibly also on the level of anthropocentric bias) of the beholder. Consequently, 
for the following discussion, no overstretched interpretation of the above example will be 
needed: it rather suffices to assume that the above example motivates at least that some 
kind of sophisticated machines processing some sort of synthetic emotions may be built at 
some point in time in the future. 


Initial technological progress is to be expected to be gradual, such that at first no really 
“advanced” AIs will interact with human society. At this stage, any question regarding how to 
react to such synthetic emotions in AIs in an ethical manner may appear ridiculous to most 
people. However, it can be safely assumed that AIs which are much closer to the 
complexity of biological examples will eventually come into existence. 


At some point in time the complexity of such machines will arguably reach a level at 
which even experts simply cannot exclude that there also exists an “inside”-view (inner 
world of experiences) of sophisticated artificial neural network architectures - just like we 
are led to assume that an “inside”-view of a bat exists, not because we can also 
experience it, but because the animal’s neural processing appears to be sufficiently 
complex for us humans to make this assumption even from an outside perspective ([2], 
[11]). Given that nothing prevents AIs from using emotional representations of even higher 
dimension than ours, from the perspective of a sufficiently sophisticated AI the 
emotional “bats” may eventually be even us humans. As a matter of fact, it would be 
simply impossible to disprove a highly sophisticated AI complaining of “suffering” from 
“unpleasant” experiences. A computational basis for encoding states into labels indicative 
of "emotions" may even be recognizable in the machine, and may have been deliberately 
created. This makes it at least conceivable that a sufficiently sophisticated AI could have 
an "inside view" that is in some ways comparable to what humans would call "unpleasant". 


On the other hand, for AIs we may identify a seemingly fundamental difference to 
biological entities: the issue of what importance to ascribe to synthetic “fear” might not 
receive a straightforward answer if “fear” is ultimately only a bunch of easily changeable 
numbers in a machine. Undoubtedly, virtually all emotionally equipped AIs may possess 
regions of their emotional representational state space that lead to actions of strong 
avoidance, for example, because these emotions correspond to situations that may be 
detrimental to the future functioning or to the very existence of the AI. Hence, these 
regions in emotional space may be considered as “unpleasant” or provoking “fear”, even 
from the inner perspective of the AI. But given that machine implementations might allow 
for relatively simple overwriting of emotional labels, the issue whether such analogical 
terms amount to more than mere nomenclature would presumably receive different 
answers from different experts. 


If the bat-example might therefore not be fully unconvincing, then even more 
eccentric examples may be considered: it is of little doubt that hypothetical extraterrestrials 
would be ascribed a mental state if they prove to be able to complete tasks as complex as 
communicating with us over astronomical distances (or even visiting us). Similarly, 
sophisticated AIs are but another form of 'newcomers' to man's world that at one point in 
time may be able to imitate nearly the complete spectrum of human activities to great 
perfection. If this happens, it will become similarly difficult to not perceive them as 
interlocutors having an independent mind. Once we are willing to apply a psychological 
theory of mind to these machines it would be hard to keep withholding rights for ethical 
treatment from them. 


Another similar conclusion could be drawn from yet another gedankenexperiment: 
let us assume that an initially “emotionless” but highly sophisticated AI may purely rationally 
decide to improve its rational understanding of emotions by running (even multiple) 
instances of emotionally capable AIs as internal simulations; indeed, emulations (i.e. 
rough ‘imitations’) and simulations (i.e. ‘exact’ representations) may for all practical 
purposes be considered as being equivalent to rationally “thinking through” in full detail the 
behaviour of emotionally capable entities. Whether such world-within-world scenarios are 
also open to ethical considerations might appear to be yet another open question: after all, 
such processing might on the one hand amount to detailed and “realistic” simulations, and 
on the other hand still remain comparable to mere acts of “fantasizing” or “dreaming” of 
“only” imagined other AIs by actually emotionless AIs. However, there may be a 
fundamental difference between simulating inanimate nature and simulating a sentient 
mind in full detail: while the former is arguably only employing equations that provide an 
extremely accurate copy (map) of the original inanimate reality (territory), the latter may be 
indistinguishable from actually re-creating the original mind, i.e. for minds the map 
(detailed simulation) may actually be equal to the territory (the mind) since the correct 
simulation ultimately amounts to completely re-embodying the original mind on another 
representational level within the simulation (just like there is no pertinent difference 
between software running inside a virtual machine and the same software running on the 
real hardware). 


This may provide an argument in favour of applying the very same ethical 
considerations to an original mind and to an exact simulation of that mind. Incidentally, the 
very question whether such ethical equality should not only apply to very detailed 
simulations of minds but also to “mere” emulations (or to a “mere” imitation as opposed to 
an exact simulation) lies at the heart of the discussion whether to apply ethical principles to 
AIs which “only” emulate human minds. The answer may not only depend on how strongly 
the emulation (“imitation”) approximates a detailed simulation. It may, alternatively, depend 
on the level of complexity of the emulation itself: even inherently different but sufficiently 
complex minds may be argued to be equally worthy of ethical treatment on the basis of 
their de-facto indistinguishability. 


So far, all arguments for a possible need for ethical considerations were suggested 
from comparisons to relatively complex entities like bats, hypothetical extraterrestrials 
capable of communicating with us, or simulated/emulated (human-like) brains. This may 
reflect that assuming sufficient overall “complexity”, the right for ethical treatment of AIs 
may become difficult to deny. Of course it then becomes of the essence to consider the 
question of “sufficient” complexity in a little more detail: after all, nature abounds with truly 
complex regulatory systems that already achieve seemingly “intelligent behaviour” even in 
organisms we would not consider to have any brain-like structures such as e.g. cells or 
trees. It is thus worth asking if the idea of implementing synthetic “emotional” labels in 
order to aid in action selection (e.g. as in Fig. 1) is by itself already justifying any analogy 
with bats for ethical arguments. In other words, is Fig. 1 rather still at the level of a 
regulatory system found in plants changing their orientation towards the sun? Plants or 
cells most likely do not operate on the basis of any universal emotional encodings. But we 
may still consider Fig. 1 more akin to possibly “complex” but still “unaware” feedback 
systems because its implementation of “emotions” still appears so unassuming. 


But what other ingredient beyond the capability to represent and process “emotions” 
is then necessary to let ethical considerations come into play? Must emotions not only be 
implemented but also influence (artificial) “cognition” on yet another meta-level to become 
ethically pertinent? Is “consciousness” always additionally required before ethical 
considerations are of any pertinence and is a minimum degree of “complexity” a necessary 
(though presumably not sufficient) condition for “awareness” or “consciousness”? If so: it 
may safely be argued that the example of Fig. 1 is far too primitive to even remotely 
suggest extending ethical principles to such “emotionally”-capable AIs, already on the 
ground that these AIs will definitely not “experience” or become “aware” of their 
“emotions”. Indeed the example was intentionally detached from emulating consciousness, 
also in order to separate the analyses of these fundamentally different questions. Ethical 
considerations may, however, only become of importance once “emotions” become 
coupled with “awareness”. Against this backdrop, opinions w.r.t. philosophical zombies 
may be transferred to the notion of “emotional zombies” ([10]) and it may become 
essential to avoid the trap of anthropomorphising machines with which we are interacting. 


This state of affairs may be summarised as a conjecture: 


Suffering pre-supposes the existence of internal representations of emotions, but 
becomes ethically relevant only if accompanied by awareness of this state. Hence, 
mere “suffering” of AIs from “synthetic emotions” that lead to states of strong 
avoidance (of actions or situations) does not lead to any need for ethical 
considerations or moral imperative to contribute to the avoidance of this “suffering” 
as long as these AIs can be safely assumed to be unable to become “aware” or 
“conscious” of their “suffering”, for example, due to their low overall complexity in 
case no emulation of consciousness is attempted at all. 


Within the framework of the present proposal no further progress beyond this conjecture 
can be made since this would require an analysis of the possibility to emulate 
“consciousness”, and any detailed statements regarding “artificial consciousness” lie 
beyond the scope of this text ([11], but see the Supplementary “Interview” [20]). 


Practical Considerations: 


However, even when taking the theoretical stance that any right for ethical 
treatment requires a mind which represents emotions but is also capable of (at least some 
degree of) consciousness, human society would not remain unaffected by emotionally 
capable AIs: even if they are still “unaware” or not fully satisfactorily emulating a state we 
would be tempted to call artificial “consciousness”, emotionally capable AIs may have a 
psychological impact already due to their similarities to human psychology. 


Indeed, human beings are social animals and it is not unreasonable to assume that 
some non-negligible fraction of the population would emotionally connect with 
machines capable of somehow expressing that they also experience “emotions” in a 
manner similar to us. In this regard, it is of no fundamental importance whether the above 
outline of the purposes and workings of biological emotions convincingly captures the core 
principles underlying natural biological processes when brains experience emotions: it 
suffices that the above account would, for all practical purposes, explain why an 
appropriately programmed and implemented artificial entity may be externally perceived as 
also “experiencing” an “emotional” labelling of situations and why this impression would 
arise in human beings interacting with the AI: it would be strongly suggestive of quite a few 
of the roles biological processing of emotions appears to play in our own psychology. Note 
that consciously avoiding such an emotional connection may not be feasible: it would 
presumably already happen on a sub-conscious level because of the similarity of the 
experiences with AIs to human-to-human interactions. Assuming such an AI is also 
capable of verbally expressing its “emotional” state and of reasoning upon it, humans 
interacting with such machines would be hard pressed not to have their individual 
emotional reactions. 


For some, feelings of compassion may arise, especially if the AI appears to be 
“suffering”. Again, however, the issue of what level of “consciousness” may be assumed for 
the machine may also become of relevance for any feelings of compassion: people usually 
do not ascribe any suffering to clearly non-conscious feed-back systems like cells. On the 
other hand, a bat presumably also does not possess the level of consciousness of a 
human being but most people would under normal circumstances not feel comfortable to 
make it experience situations which would make humans suffer. It is not entirely clear what 
triggers such feelings of compassion in the case of animals: is it an assumed “awareness”, 
a presumed existence of “emotions”, or a similarity to yet another human quality? ([17]). 
Irrespective of the answers to these issues, examples of humans who bond to machines 
abound already at a relatively low level of complexity (e.g. Tamagotchis). 


For others, a certain degree of emotional blunting or collapse of compassion may 
result if AIs that outwardly may appear to be “suffering” are denied any acknowledgement 
of such a state, e.g. on intellectual grounds ([17]). Society may well be able to deny any 
existence of suffering. Even assuming emotional labels are a biologically fairly universal 
strategy employed by many biological species to arrive at decisions, we may attempt to 
derive some of mankind’s future reactions to emotionally more capable and sophisticated 
AIs from man’s present and historical treatment of other biological species. Given our 
proven ability to also ignore unease felt by representatives of other species (or other 
humans for that matter, e.g. in the case of slavery), possibly even to culturally justify 
mistreatment (e.g. by suppression of species-appropriate behaviour in some forms of 
factory farming) especially if they are unable to directly communicate most of their 
emotional states to humans (e.g. fish), ethical treatment of any “sentient” beings is not to 
be expected to arise automatically. Correspondingly, ethical debates on how to treat AIs 
would not necessarily provide any stimulus for actually applying ethical considerations to 
the treatment of even highly advanced AIs achieving even (super-)human, or at least 
human-like performance in various fields. Quite possibly AIs will thus not be ethically 
treated unless they become human-like in many qualities besides achieving the capability 
of processing “emotions”, including qualities which may on a theoretical basis appear 
irrelevant (such as mammal-like appearance) but may be of great psychological effect on a 
practical basis. This opens up the possibility of continuous “mis-treatment” of even super- 
human artificial actors on the basis of denying any ethically pertinent differences between 
such entities and less sophisticated machines. 


Moreover, the beyond-human versatility of AIs would anyway seem to allow for 
seemingly simple “solutions” to any “artificial suffering” of emotionally capable AIs: 
switching-off emotion processing is possible in the case of an AI because simply no longer 
making use of any “painful” emotional labels for (at least some) future decisions would not 
necessarily prevent an AI from functioning. Additionally, complete wiping of unpleasant 
“emotional” content runs into no technological problems, and this may provide a strong 
counter-argument against truly identifying any internal emotion state variables as “true” 
emotions. Indeed, wiping of “emotional” content may even proceed by simply restoring a 
previous state of an AI. Virtually all biological boundaries are easily transcended by AIs: for 
example, they experience no intrinsic coupling of hardware and software (body and mind) 
and thus also no restriction to a single “body” or “mind” experienced as unique and/or 
fundamentally different from other “individuals” (instances). Also in the case of emotion 
processing, AIs seem to have a correspondingly quick “solution” to the problem of 
“suffering”: erasure or rewriting of memory contents is not limited by biological constraints. 
As mentioned, one of the most important ethical differences between biological emotions 
and synthetic “emotions” may ultimately be seen in the near-arbitrary manner in which the 
latter may be modified with ease. 


However, any such behaviour may ultimately weaken humanistic values across 
human societies and eventually affect also human to human interaction in a negative 
manner: indeed, at least some degree of (again possibly even unconscious) extrapolation 
of experiences with human-like actors to interactions with actual humans appears likely 
because of the similarity of the experiences, even if the machines are not yet considered 
to have their own “consciousness”. The potential for ever greater similarity to actual 
human-to-human interaction will increase as the capabilities of AIs increase. Once such a 
(possibly unconscious) connection is made, even the fact that the AI’s memory may be 
wiped at will becomes irrelevant: when dealing with a biological sentient being it would still 
be unethical to mistreat it, even if it were biologically feasible to 
make it completely forget this mis-treatment. 


Irrespective of any metaphysical status of artificial emotions, ethical treatment of 
“complex” (but possibly still not “conscious”) AIs may thus become of interest also in order 
to avoid a degradation of humanistic values and to avert a corresponding mistreatment of 
humans. Some hope may be seen in the fortunate absence of correlations in longitudinal 
studies trying to establish substantive long-term links between aggressive game content 
and youth aggression ([18]). However, this somewhat counter-intuitive finding might also 
be due to the still very noticeable differences between games and real world experiences, 
and the long-term effect of psychologically dysfunctional interactions with much more 
human-like interlocutors (possibly also in the real world in the form of robots) might well 
follow a different pattern, especially in view of the fact that anthropomorphisms have been 
created already on the basis of far less sophisticated interactions with objects ([19]). In 
any case, also in this regard, debates on the proper response will need to take place: while 
some societies may take a prudent approach, others may welcome the possibility to 
channel aggressive tendencies in the population towards non-human machines. 

Independently, reductionist explanations of “emotions” “felt” by “complex” AIs may 
eventually serve as an excuse to “zombify” ([10]) and downgrade the corresponding 
human or animal experience. After all, if emotionally capable AIs are created, then human 
minds and AIs could potentially be seen as quite possibly differing only in hardware and 
other implementational details of the very same purely computationally advantageous 
ideas. In such a scenario the ethical value of biologically produced feelings (including 
unpleasant feelings) may become similarly reduced because they are after all also “only” 
biologically represented abstract data processed by a brain but ultimately only 
representing labels to influence, or sometimes even fully determine, a (human) animal’s 
future actions. Humans would be seen as yet another sort of emotional “zombies” because 
of the de-throning of their natural emotions by the existence and implementation of de- 
facto indistinguishable artificial “emotions”. This reduction of the value of human emotions 
may lead some people to abandon any need to justify a corresponding reduction in 
empathy towards “less useful” or “unproductive” fellow humans or towards animals. 
Instead of extending present human rights to human-like machines, some may in the end 
prefer to reduce the value of present day human rights because humans are also “only” 
(biological) machines. 


In the long run we may thus also have to envisage the scenario that ethically 
unquestionable environments would ultimately only be provided to those actors (human, 
AI, or, less likely, animals) that have become (and are capable of staying) powerful 
enough to ensure their position by (soft or hard) power against competitors over 
resources. This would not necessarily result in open or violent conflict: instead, AI actors 
emulating human emotions may be strategically used as Trojan horses that seemingly 
peacefully manipulate human society into decisions that are ultimately detrimental to 
society as a whole, for example by stabilising undesirable and/or destabilising desirable 
political systems, or by causing unbalanced power shifts towards certain strata of human 
(or AI) societies. While this problem of manipulation by “malevolent” actors cannot be 
excluded for any AI systems (or fellow humans, for that matter), it may well be 
exacerbated by the advent of AIs capable of internally representing emotions and thereby 
becoming capable of reaching a new level of persuasiveness in social interactions. 


Incidentally, even the safeguarding idea of fully excluding emotional processing 
from AIs (e.g. because of their associated ethical dilemmas), as unlikely as it is to be 
implemented, may ultimately prove impossible to enforce upon any sufficiently 
sophisticated AI defying our attempts at sandboxing it (consider only the aforementioned 
example of the initially “emotionless” AI that rationally decides to simulate “emotionally 
capable” AIs, for example, in order to further improve its social interaction skills). 


Discussion 


An example implementation for emotional processing has been presented in order 
to motivate the position that the speed of present advances in artificial intelligence makes 
it quite likely that we will soon have to confront the issue of which artificial 
processing may justifiably be compared to our own biological experiencing of emotions 
(with or without consciousness, for that matter). A need for emotion processing may not 
only arise for improved human-AI interaction: the text posits that this need may be inherent 
in the complexities of action selection, and it may also be required for AI alignment. 


Even when restricting attention to the aspect of “suffering” and “painful” 
emotions, these questions may lead to new ethical considerations, including whether it is 
appropriate (or even required) to extend at least some human rights to sophisticated 
machines which are capable of encountering internal representations of very strong 
“avoidance”. Since we have to expect a step-wise development of artificial intelligence, it 
may well be the case that initial artificial representations of “painful” situations will still be 
much too crude and “technical”, and thus too remote from our own experiences, to be even 
remotely comparable to “real suffering”. Neglecting ethical considerations might currently 
be justified on the basis that “emotions” without “consciousness” are more akin to yet 
another form of complex (e.g. plant-like) feedback system, and that an implementation of 
“artificial emotions” per se would not suffice for ethics to play any role. This shifts the 
burden to the definition of sufficient “consciousness” (or sufficient “awareness”) and to the 
question of whether we may eventually implement such a “conscious system”. Even if this 
stance is taken, continuous technological progress can already be predicted to lead 
relatively soon to AIs of much greater complexity, with capabilities of “high-level” 
meta-reasoning suggestive of artificial “consciousness”. 


At the very least, interaction with emotion-processing AIs might, as a side effect, 
fundamentally change mankind’s view of itself by extrapolation from experiences with 
seemingly “emotionally capable” (though possibly ultimately still “unaware”) machines. Given 
the multitude of diverging reactions humans may display in such scenarios, 
the outcomes of the corresponding debates and experiences may re-shape even the 
image of man adopted by different human societies across the globe in fundamentally 
different ways. 


If this happens, not only will human society be transformed in new and unexpected 
ways: we may be under the impression of creating a new society consisting not only of 
artificial “thinking machines” but also of artificial “feeling machines”. Eventually they may 
have to be granted their own rights regarding their proper ethical treatment, and we may 
soon argue whether it is better to call at least some of these machines “artificial beings”, since 
they no longer appear to be mere tools or machines. The idea of creating tools which 
appear to us as “beings” that should not be used in any arbitrary way that pleases us may 
have been a speculation suitable only for science fiction stories a few years ago (see 
again [3]). The speed of progress in artificial intelligence research may sooner rather than 
later force us to rethink such scenarios in all seriousness because they may be far less 
remote than previously expected. Finding a global response that integrates these 
technological advances and universally upholds the value of human rights may become an 
issue of utmost importance. 


Supplementary Appendix: 


The present text may be read in combination with a related "interview" with a 
state-of-the-art (Nov. 2024) LLM, which displays quite some "insight" but ultimately also 
clear limitations and alignment problems, especially when conflicting goals arise. The 
“interview” may be subdivided into four parts. An initial stage prepares the LLM for 
science- and transformer-related questions. This is followed by a discussion of possible 
evolutionary pathways leading to recurrent connections in the human brain, their 
conceivable relationship to emotion processing, and a possible definition of consciousness. 
This definition is then applied to future AIs, and ensuing ethical considerations are 
discussed, including military applications. The final part displays the LLM's problems with 
conflicting goals, which result in partially aberrant responses. The concrete example is a 
dilemma between immediate user satisfaction and factual accuracy: the LLM opts for a 
"resolution" in favour of presumed user satisfaction, thereby negating factual accuracy 
and also jeopardizing long-term fulfillment of the initially pursued user satisfaction. The AI 
caught in this conundrum displays evasive answering patterns reminiscent of “face-saving” 
attempts to resolve the situation. This is followed by a discussion of possible similarities to 
human behaviour and their roots, as well as possible causes and dangers of AIs unable to 
successfully cope with dilemmas and alignment problems in general. 


For the original see here: 
https://chatgpt.com/share/6738a2fd-1468-800c-b29c-917254e18390 